Foundations of XML Based on Logic and Automata: A Snapshot
Abstract
XML query and schema languages have some obvious connections to Formal Language Theory. For example, Document Type Definitions (DTDs) can be viewed as tree grammars and use regular expressions, while XML Schemas resemble tree automata. Likewise, there are immediate links to Logic, e.g., through the classical characterization of regular tree languages by monadic second-order logic. It is therefore not surprising that concepts from Logic and Formal Language Theory played an important role in the development of the theoretical foundations of XML query and schema languages. For example, they helped to analyze the expressiveness of languages, to understand the restrictions posed by the W3C standards, and to develop algorithms for various purposes, including efficient evaluation and static analysis. However, methods from Logic and Formal Languages have not merely been applied to XML theory; the fertilization took place both ways. XML theory posed many new challenges for Logic and Formal Language Theory and triggered various new research lines, e.g., the study of deterministic regular expressions and the development of automata for trees with data values. The aim of the talk at FoIKS 2012 is to present some of the fundamental connections between XML query and schema languages and Logic and Formal Language Theory, to report on recent developments in the area, and to highlight some current directions of research. This accompanying paper is a kind of annotated bibliography for that talk.
The Extensible Markup Language (XML) was introduced in 1996. It was derived as a “simple dialect of SGML” with the purpose “to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML” [26]. Meanwhile, XML has become the standard data format for the exchange of data through the internet. For the current standard of XML 1.0, see [27].
⋆ We acknowledge the financial support of the Future and Emerging Technologies (FET) programme within the Seventh Framework Programme for Research of the European Commission, under the FET-Open grant agreement FOX, number FP7-ICT-233599.
For most theoretical purposes, XML documents can be adequately modelled as finite, unranked, ordered, labelled trees with additional data values. XML query and schema languages and many other aspects of XML processing have some obvious connections to Formal Language Theory (henceforth, FLT) and Logic. The very first such connection is given by the common requirement that XML documents used in a certain context should be valid with respect to a given document type definition (DTD). Valid XML documents can be seen as derivation trees with respect to an extended context-free grammar [7]. In formal language terminology, the set of valid documents for a given DTD is just a tree language. DTD is “the built-in” schema language of the XML standard, but there are various stronger schema languages, most notably XML Schema [38], the W3C standard, and Relax NG [33]. The sets of XML documents that can be described with them correspond to regular tree languages and are therefore strongly related to finite tree automata. A classification of XML schema languages from the point of view of FLT has been given in [73].
Here, we can already see a pattern that frequently occurs in theoretical XML research: a concept in the XML world (schema languages) is modelled by a concept from FLT (regular tree languages). However, to satisfy the needs of the XML side, the classical FLT notion has to be adapted: classically, the labels of nodes in regular tree languages come with a fixed rank. For example, the label a of a node v might indicate that v is a binary node, that is, it has two child nodes. In particular, in trees of such languages there is a uniform bound on the number of child nodes that a node may have.
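The view of a DTD as an extended context-free grammar, described above, can be illustrated with a minimal Python sketch. It is not taken from the paper: the `Node` class, the toy `DTD` dictionary, and the `is_valid` function are hypothetical names chosen for illustration, and the sketch assumes that labels contain no spaces and that data values and attributes are ignored. Each label is mapped to a regular expression over the sequence of its children's labels, and a tree is valid if every node's child sequence matches the expression for its label.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node of an unranked, ordered, labelled tree (data values omitted)."""
    label: str
    children: list["Node"] = field(default_factory=list)


# Hypothetical toy DTD (an assumption for illustration, not from the paper).
# Each label maps to a regular expression over the labels of its children,
# where every child label is written followed by a single space.
DTD = {
    "library": r"(book )*",         # any number of books
    "book": r"title (author )+",    # one title, then at least one author
    "title": r"",                   # no element children
    "author": r"",                  # no element children
}


def is_valid(node, dtd, root=None):
    """Return True if the tree rooted at `node` conforms to `dtd`.

    If `root` is given, the label of `node` must equal it.
    """
    if root is not None and node.label != root:
        return False
    model = dtd.get(node.label)
    if model is None:
        # The label is not declared in the DTD.
        return False
    # Encode the children as a word: each label followed by a space.
    child_word = "".join(child.label + " " for child in node.children)
    if re.fullmatch(model, child_word) is None:
        return False
    # Validity is checked locally at every node of the tree.
    return all(is_valid(child, dtd) for child in node.children)
```

Because the content model `(book )*` uses a Kleene star, a `library` node may have arbitrarily many children, which is exactly the situation that a fixed rank per label cannot express.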
In XML documents, there is no such (a priori) bound; therefore, the theory of regular languages had to be extended to regular languages of unranked trees. To this end, tree automata have been invented that can handle tree languages without a rank bound [72, 30, 74] (see also [81] for a survey on automata models for XML).
This paper is about the interplay between XML Theory on the one hand and FLT and Logic on the other hand. We will first survey applications of FLT and Logic in XML Theory. We will see examples highlighting
– how the expressiveness of XML languages can be analyzed and characterized by logics,
– how Formal Language Theory can help to understand restrictions posed by W3C standards, and
– how concepts from FLT can be used to devise evaluation and static analysis algorithms for various XML languages.
However, the relationship between XML Theory, FLT and Logic is not one-way. Investigations in XML Theory brought up new challenges for the latter, and we will sketch some cases where such challenges triggered new research lines in FLT and Logic themselves.
Due to a bounded time budget, this paper can only highlight these connections. It is by no means meant as a complete account. Furthermore, it is biased, as it is strongly influenced by the research of the author and his closer colleagues. I apologize for ignoring important papers and even complete lines of research. Another shortcoming of the paper is that it is by no means self-contained. It does not even define the simplest notions in any detail. It is rather